1 Translation spotting for translation memories
Michel Simard 

May 2003 Proceedings of the HLT-NAACL 2003 Workshop on Building and using 
parallel texts: data driven machine translation and beyond - Volume 3 

Publisher: Association for Computational Linguistics 

Full text available: pdf (150.25 KB) Additional Information: full citation, abstract, references

The term translation spotting (TS) refers to the task of identifying the target-language 
(TL) words that correspond to a given set of source-language (SL) words in a pair of text 
segments known to be mutual translations. This article examines this task within the 
context of a sub-sentential translation-memory system, i.e. a translation support tool 
capable of proposing translations for portions of a SL sentence, extracted from an archive 
of existing translations. Different methods are pro ... 

2 Towards a unified approach to memory- and statistical-based machine translation
Daniel Marcu 

July 2001 Proceedings of the 39th Annual Meeting on Association for Computational 
Linguistics ACL '01 

Publisher: Association for Computational Linguistics 

Full text available: pdf (101.49 KB) Additional Information: full citation, abstract, references

We present a set of algorithms that enable us to translate natural language sentences by 
exploiting both a translation memory and a statistical-based translation model. Our results 
show that an automatically derived translation memory can be used within a statistical 
framework to often find translations of higher probability than those found using solely a 
statistical model. The translations produced using both the translation memory and the 
statistical model are significantly better than transl ... 



3 Techniques for automatically correcting words in text
Karen Kukich 

December 1992 ACM Computing Surveys (CSUR), volume 24 issue 4 
Publisher: ACM Press 

Full text available: pdf (6.23 MB) Additional Information: full citation, abstract, references, citings, index terms, review

Research aimed at correcting words in text has focused on three progressively more 
difficult problems: (1) nonword error detection; (2) isolated-word error correction; and (3)
context-dependent word correction. In response to the first problem, efficient pattern-matching
and n-gram analysis techniques have been developed for detecting strings that
do not appear in a given word list. In response to the second problem, a variety of general 
and application-specific spelling cor ... 

Keywords: n-gram analysis, Optical Character Recognition (OCR), context-dependent 
spelling correction, grammar checking, natural-language-processing models, neural net 
classifiers, spell checking, spelling error detection, spelling error patterns, statistical- 
language models, word recognition and correction 



4 Natural language processing and query systems: The function of semantics in
automated language processing
Milos Pacak, Arnold W. Pratt 

April 1971 Proceedings of the 1971 international ACM SIGIR conference on 
Information storage and retrieval 

Publisher: ACM Press 

Full text available: pdf (1.30 MB) Additional Information: full citation, abstract, references, citings

This paper is a survey of some of the major semantic models that have been developed for 
automated semantic analysis of natural language. Current approaches to semantic analysis 
and logical inference are based mainly on models of human cognitive processes such as 
Quillian's semantic memory, Simmons' Protosynthex III and others. All existing systems
and/or models, more or less experimental, were applied to a small subset of English. They 
are highly tentative because the definitions of semantic pr ... 

Keywords: computational linguistics, grammars, natural language processing, semantics 




5 Machine learning comprehension grammars for ten languages

Patrick Suppes, Lin Liang, Michael Bottner 

September 1996 Computational Linguistics, Volume 22 Issue 3 

Publisher: MIT Press 

Full text available: pdf (1.27 MB) Additional Information: full citation, abstract, references
Publisher Site

Comprehension grammars for a sample of ten languages (English, Dutch, German, French,
Spanish, Catalan, Russian, Chinese, Korean, and Japanese) were derived by machine 
learning from corpora of about 400 sentences. Key concepts in our learning theory are: 
probabilistic association of words and meanings, grammatical and semantical form 
generalization, grammar computations, congruence of meaning, and dynamical 
assignment of denotational value to a word. 

6 A shared, segmented memory system for an object-oriented database

Mark F. Hornick, Stanley B. Zdonik 

January 1987 ACM Transactions on Information Systems (TOIS), volume 5 issue 1
Publisher: ACM Press 

Full text available: pdf (2.05 MB) Additional Information: full citation, abstract, references, citings, index terms, review

This paper describes the basic data model of an object-oriented database and the basic 
architecture of the system implementing it. In particular, a secondary storage 
segmentation scheme and a transaction-processing scheme are discussed. The 
segmentation scheme allows for arbitrary clustering of objects, including duplicates. The 
transaction scheme allows for many different sharing protocols ranging from those that 
enforce serializability to those that are nonserializable and require communi ... 

7 Special issue on machine learning approaches to shallow parsing: Shallow parsing
using noisy and non-stationary training material
Miles Osborne 

March 2002 The Journal of Machine Learning Research, Volume 2 
Publisher: MIT Press 

Full text available: pdf (181.57 KB) Additional Information: full citation, abstract, references, citings, index terms

Shallow parsers are usually assumed to be trained on noise-free material, drawn from the 
same distribution as the testing material. However, when either the training set is noisy or 
else drawn from a different distribution, performance may be degraded. Using the parsed
Wall Street Journal, we investigate the performance of four shallow parsers (maximum 
entropy, memory-based learning, N-grams and ensemble learning) trained using various 
types of artificially noisy material. ... 



8 System descriptions: Hughes Trainable Text Skimmer: description of the TTS system
as used for MUC-3

Charles P. Dolan, Thomas V. Cuda, Seth R. Goldman, Alan M. Nakamura 

May 1991 Proceedings of the 3rd conference on Message understanding MUC3 '91 

Publisher: Association for Computational Linguistics 

Full text available: pdf (435.49 KB) Additional Information: full citation, abstract

The objective of the Hughes Trainable Text Skimmer (TTS) Project is to create text 
skimming software that: (1) can be easily re-configured for new applications, (2) improves 
its performance with use, and (3) is fast enough to process megabytes of text per day. 
The TTS-MUC3 system is our first full scale prototype. 



9 Memory utilization efficiency under a class of first-fit algorithms

Aaron Tenenbaum 

January 1980 Proceedings of the ACM 1980 annual conference 
Publisher: ACM Press 

Full text available: pdf (413.77 KB) Additional Information: full citation, abstract, references, index terms

This paper examines an improved version of a modified first-fit storage allocation 
algorithm. In this version, small blocks of free storage are not permitted to remain on the 
free list but instead are placed on a separate sliver list, available for recombination with 
newly freed blocks. The memory utilization efficiency of a system under this algorithm is 
shown to be markedly superior to a system using an algorithm in which such blocks are 
unavailable for either allocation or recombination. ... 



10 A scalable mark-sweep garbage collector on large-scale shared-memory machines
Toshio Endo, Kenjiro Taura, Akinori Yonezawa 

November 1997 Proceedings of the 1997 ACM/IEEE conference on Supercomputing 
(CDROM) 

Publisher: ACM Press 

Full text available: pdf (96.62 KB) Additional Information: full citation, abstract, references, citings

This work describes implementation of a mark-sweep garbage collector (GC) for shared- 
memory machines and reports its performance. It is a simple "parallel" collector in which 
all processors cooperatively traverse objects in the global shared heap. The collector stops 
the application program during a collection and assumes a uniform access cost to all 
locations in the shared heap. Implementation is based on the Boehm-Demers-Weiser 
conservative GC (Boehm GC). Experiments have been done on Ultra ... 

Keywords: dynamic load balancing, garbage collection, parallel algorithm, scalability, 
shared-memory machine 

11 Information storage and retrieval: a survey and functional description
Jack Minker 

September 1977 ACM SIGIR Forum, volume 12 issue 2 
Publisher: ACM Press 

Full text available: pdf (5.14 MB) Additional Information: full citation, abstract, references

Information Storage and Retrieval (IS&R) encompasses a broad scope of topics ranging 
from basic techniques for accessing data to sophisticated approaches for the analysis of 
natural language text and the deduction of information. Within the field, three general 
areas of investigation can be distinguished not only by their subject matter but also by the 
types of individuals presently interested in them: (1) Document retrieval, (2) Generalized
data management, and (3) Question-answering. A functional ...

Keywords: automatic indexing, data management, data structures, deductive search, 
information retrieval, natural language, problem solving, question-answering, relational 
data systems, theorem proving 

12 Basic elements of COBOL 61
Jean E. Sammet
May 1962 Communications of the ACM, volume 5 issue 5 
Publisher: ACM Press 

Full text available: pdf (1.70 MB) Additional Information: full citation, references, citings




13 The FINITE STRING newsletter: Abstracts of current literature

American Journal of Computational Linguistics Staff 
October 1981 Computational Linguistics, volume 7 issue 4 
Publisher: MIT Press 

Full text available: pdf Additional Information: full citation
Publisher Site



14 Compiling nested data-parallel programs for shared-memory multiprocessors

Siddhartha Chatterjee 

July 1993 ACM Transactions on Programming Languages and Systems (TOPLAS),
Volume 15 Issue 3
Publisher: ACM Press 

Full text available: pdf (4.17 MB) Additional Information: full citation, references, citings, index terms, review



Keywords: compilers, data parallelism, shared-memory multiprocessors 



15 Fast detection of communication patterns in distributed executions
Thomas Kunz, Michiel F. H. Seuren 

November 1997 Proceedings of the 1997 conference of the Centre for Advanced 
Studies on Collaborative research 

Publisher: IBM Press 

Full text available: pdf (4.21 MB) Additional Information: full citation, abstract, references, index terms

Understanding distributed applications is a tedious and difficult task. Visualizations based
on process-time diagrams are often used to obtain a better understanding of the execution
of the application. The visualization tool we use is Poet, an event tracer developed at the 
University of Waterloo. However, these diagrams are often very complex and do not 
provide the user with the desired overview of the application. In our experience, such tools 
display repeated occurrences of non-trivial commun ... 

16 Parallel execution of prolog programs: a survey

Gopal Gupta, Enrico Pontelli, Khayri A.M. Ali, Mats Carlsson, Manuel V. Hermenegildo 
July 2001 ACM Transactions on Programming Languages and Systems (TOPLAS),
Volume 23 Issue 4
Publisher: ACM Press 

Full text available: pdf (1.95 MB) Additional Information: full citation, abstract, references, citings, index terms

Since the early days of logic programming, researchers in the field realized the potential 
for exploitation of parallelism present in the execution of logic programs. Their high-level 
nature, the presence of nondeterminism, and their referential transparency, among other 
characteristics, make logic programs interesting candidates for obtaining speedups 
through parallel execution. At the same time, the fact that the typical applications of logic 
programming frequently involve irregular computatio ... 

Keywords: Automatic parallelization, constraint programming, logic programming, 
parallelism, prolog 



17 A phrase-based, joint probability model for statistical machine translation
Daniel Marcu, William Wong 

July 2002 Proceedings of the ACL-02 conference on Empirical methods in natural 
language processing - Volume 10 EMNLP '02 

Publisher: Association for Computational Linguistics 

Full text available: pdf (96.49 KB) Additional Information: full citation, abstract, references

We present a joint probability model for statistical machine translation, which 
automatically learns word and phrase equivalents from bilingual corpora. Translations 
produced with parameters estimated using the joint model are more accurate than 
translations produced using IBM Model 4. 

18 Sheaved memory: architectural support for state saving and restoration in paged
systems
M. E. Staknis

April 1989 ACM SIGARCH Computer Architecture News, Proceedings of the third
international conference on Architectural support for programming 
languages and operating systems ASPLOS-III, volume 17 issue 2 

Publisher: ACM Press 

Full text available: pdf (973.26 KB) Additional Information: full citation, abstract, references, citings, index terms

The concept of read-one/write-many paged memory is introduced and given the name 
sheaved memory. It is shown that sheaved memory is useful for efficiently maintaining 
checkpoints in main memory and for providing state saving and state restoration for
software that includes recovery blocks or similar control structures. The organization of 
sheaved memory is described in detail, and a design is presented for a prototype sheaved- 
memory module that can be built easily from inex ... 

Hari Sundaram, Shih-Fu Chang

October 2000 Proceedings of the eighth ACM international conference on Multimedia 
Publisher: ACM Press 

Full text available: pdf (924.83 KB) Additional Information: full citation, abstract, references, citings, index terms

In this paper we present novel algorithms for computing scenes and within-scene 
structures in films. We begin by mapping insights from film-making rules and experimental 
results from the psychology of audition into a computational scene model. We define a 
computable scene to be a chunk of audio-visual data that exhibits long-term consistency 
with regard to three properties: (a) chromaticity, (b) lighting, (c) ambient sound. Central to
the computational model is the notion of a causal, finite-me ... 

Keywords: computable scenes, films, memory models, periodic analysis transform, scene 
detection, shot-level structure 



20 Inverted files for text search engines
Justin Zobel, Alistair Moffat 

July 2006 ACM Computing Surveys (CSUR), volume 38 issue 2 
Publisher: ACM Press 

Full text available: pdf (944.29 KB) Additional Information: full citation, abstract, references, index terms

The technology underlying text search engines has advanced dramatically in the past 
decade. The development of a family of new index representations has led to a wide range 
of innovations in index storage, index construction, and query evaluation. While some of 
these developments have been consolidated in textbooks, many specific techniques are 
not widely known or the textbook descriptions are out of date. In this tutorial, we 
introduce the key techniques in the area, describing both a core impl ... 

Keywords: Inverted file indexing, Web search engine, document database, information 
retrieval, text retrieval 